[SPARK-22811][pyspark][ml] Fix pyspark.ml.tests failure when Hive is not available. #19997
MrBago wants to merge 1 commit into apache:master
Conversation
@HyukjinKwon I think you might have touched that code last.
Test build #84980 has finished for PR 19997 at commit
@MrBago, I think you can just skip when Hive support is disabled, if this matters. That test is valid only with Hive support.
```diff
 import numpy as np
 from numpy import abs, all, arange, array, array_equal, inf, ones, tile, zeros
 import inspect
+import py4j
```
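A minimal, self-contained illustration of why the import matters (no py4j required; the exception is simulated): Python resolves the exception name in an `except` clause only when an exception actually propagates, so referencing an unimported module there raises `NameError` instead of catching anything.

```python
def catch_with_missing_import():
    # Simulates `except py4j.protocol.Py4JError:` without `import py4j`:
    # the name `py4j` is looked up only when an exception reaches the
    # except clause, and that lookup itself fails with NameError.
    try:
        raise RuntimeError("raised inside the try block")
    except py4j.protocol.Py4JError:  # `py4j` was never imported here
        return "skipped"
```

Calling `catch_with_missing_import()` raises `NameError` (chained onto the original `RuntimeError`), which is exactly the failure mode the added import fixes.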
BTW, mind elaborating how importing this fixes an issue? It sounds orthogonal to me.
On the line below, we catch py4j.protocol.Py4JError so that we can raise SkipTest instead, but without the py4j import we get a NameError rather than skipping the test. Furthermore, because tearDownClass() then never runs, we leave behind stale state that causes other tests to fail. The except branch is only ever triggered in environments without Hive, where this test should be skipped.
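The skip pattern described above can be sketched with a small, self-contained stand-in (nothing here assumes pyspark or py4j is installed; `FakePy4JError` and `create_hive_session` are hypothetical stand-ins for `py4j.protocol.Py4JError` and the Hive-enabled SparkSession setup):

```python
import unittest


class FakePy4JError(Exception):
    """Stand-in for py4j.protocol.Py4JError."""


def create_hive_session():
    # Stand-in for building a Hive-enabled SparkSession; in an
    # environment without Hive classes, that raises a Py4JError.
    raise FakePy4JError("Hive classes are not found.")


class HiveContextSQLTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        try:
            cls.spark = create_hive_session()
        except FakePy4JError:
            # With the exception type importable, we can skip cleanly.
            # In the real suite, a missing `import py4j` made this
            # except clause raise NameError instead of skipping.
            raise unittest.SkipTest("Hive is not available")

    def test_runs_only_with_hive(self):
        self.assertIsNotNone(self.spark)
```

Because setUpClass raises SkipTest, unittest records the whole class as skipped without running (or leaking state from) any of its tests.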
Ah, it was my bad. Yup, you are right.
It was also written in the JIRA. Sorry, I just got up (I am in Korea :) ..) and rushed to leave some comments.
Merged to master.
What changes were proposed in this pull request?
pyspark.ml.tests is missing a py4j import. I've added the import and fixed the test that uses it. This test was failing only when running without Hive.
How was this patch tested?
Existing tests.